Description

FIELD OF THE INVENTION 

This invention relates to a biologically pure DNA signal sequence which encodes an amino acid signal peptide necessary for directing the secretion, from certain defined hosts, of proteins in bioactive form.

BACKGROUND OF THE INVENTION 

In the biological production of commercially viable proteins by the fermentation of microorganisms, the ability to have the microorganisms secrete the desired proteins into the broth is very significant. However, there are many commercially viable proteins encoded by genetically engineered DNA constructs which are not secreted by the cells in which the DNA is expressed. This often necessitates harvesting the cells, bursting the cell walls, recovering the desired proteins in pure form and then chemically renaturing the pure material to restore its bioactive function. This downstream processing, as it is called, is illustrated in Figure 1.

Some cells and microorganisms carry out the biological equivalent of downstream processing by secreting proteins in bioactive form. The mechanism which directs the secretion of some proteins through the cell walls is not fully understood. For example, Streptomyces griseus, an organism used for the commercial production of Pronase, secretes many extracellular proteins (Jurasek, L., P. Johnson, R.W. Olafson, and L.B. Smillie (1971), An improved fractionation system for pronase on CM-Sephadex, Can. J. Biochem., 49:1195-1201). Protease A and protease B, two of the serine proteases secreted by S. griseus, have sequences which are 61% homologous on the basis of amino acid identity (Fujinaga, M., L.T.J. Delbaere, G.D. Brayer, and M.N.G. James (1985), Refined structure of α-lytic protease at 1.7 Å resolution: Analysis of hydrogen bonding and solvent structure, J. Mol. Biol., 183:479-502; Jurasek, L., M.R. Carpenter, L.B. Smillie, A. Gertler, S. Levy, and L.H. Ericsson (1974), Amino acid sequencing of Streptomyces griseus protease B, a major component of pronase, Biochem. Biophys. Res. Comm., 61:1095-1100; Young, C.L., W.C. Barker, C.M. Tomaselli, and M.O. Dayhoff (1978), Serine proteases, In M.O. Dayhoff (ed.), Atlas of Protein Sequence and Structure 5, suppl. 3:73-93). These proteases also have similar tertiary structure, as determined by X-ray crystallography (Delbaere, L.T.J., W.L.B. Hutcheon, M.N.G. James, and W.E. Thiessen (1975), Tertiary structural differences between microbial serine proteases and pancreatic serine enzymes, Nature 257:758-763; Fujinaga, M., L.T.J. Delbaere, G.D. Brayer, and M.N.G. James (1985), Refined structure of α-lytic protease at 1.7 Å resolution: Analysis of hydrogen bonding and solvent structure, J. Mol. Biol., 183:479-502; James, M.N.G., A.R. Sielecki, G.D. Brayer, L.T.J. Delbaere, and C.-A. Bauer (1980), Structures of product and inhibitor complexes of Streptomyces griseus protease A at 1.8 Å resolution, J. Mol. Biol., 144:43-88). Although the structures of proteases A and B have been extensively studied, the genes encoding these proteases have not been characterized before. EP-A-0 222 279 discloses signal peptides derived from Streptomyces.

SUMMARY OF THE INVENTION

In accordance with this invention, the genes encoding protease A and protease B of S. griseus have been isolated and investigated to reveal DNA sequences which each direct the secretion of an encoded protein fused either directly or indirectly to a signal peptide encoded by the DNA.

According to an aspect of the invention, a recombinant DNA sequence comprises a signal sequence and a gene sequence encoding a protein. The recombinant DNA sequence, when expressed in a living cell, encodes an amino acid signal peptide together with the protein. The signal peptide directs secretion of the protein from a cell within which the DNA signal sequence is expressed.

According to another aspect of the invention, a biologically pure isolated DNA signal sequence encodes a 38 amino acid signal peptide which directs secretion of a recombinant gene-sourced protein linked to such 38 amino acid signal peptide, from a cell in which the DNA signal sequence is expressed. The DNA signal sequence is isolated from Streptomyces griseus.

According to another aspect of the invention, the DNA signal sequence in conjunction with a gene sequence encoding a protein is inserted into a vector, such as a plasmid or a phage.

According to another aspect of the invention, the DNA signal sequence is adapted for expression in a living cell having enzymes catalyzing the formation of disulphide bonds.

According to another aspect of the invention, the biologically pure isolated DNA signal sequence is that of Figure 4a.

According to another aspect of the invention, the biologically pure isolated DNA signal sequence is that of Figure 5a.

According to another aspect of the invention, a fused protein is encoded by the recombinant DNA sequence of Figure 4 or Figure 5.

According to another aspect of the invention, a transformed prokaryotic cell is provided which has inserted therein a suitable vector including the recombinant DNA encoding the signal protein. The transformed prokaryotic cell may be selected from the genus Streptomyces.

According to another aspect of the invention, a biologically pure culture has a transformed prokaryotic cell with the recombinant DNA sequence in a suitable vector. The culture is capable of producing, as an intermediate, the fused protein of the amino acid signal peptide and the protein. The protein itself is produced in a recoverable quantity upon fermentation of the transformed cell in an aqueous nutrient medium. The signal peptide directs secretion of the protein from the cell.

According to another aspect of the invention, a biologically pure culture, transformed with the functional signal sequence as described above, is able to direct the secretion from the cell of proteins whose bioactivity is dependent upon the formation of correctly positioned intramolecular disulphide bonds.

A biologically pure DNA sequence encoding a fused protein including protease A has the combined DNA sequence of Figures 4a, 4b and 4c.

A biologically pure DNA sequence encoding a fused protein including protease B has the combined DNA sequence of Figures 5a, 5b and 5c.


BRIEF DESCRIPTION OF THE DRAWINGS 

With reference to the Figures, a variety of short forms have been used to identify restriction endonucleases, amino acids, deoxyribonucleic acids and related information. Standard nomenclature has been used in identifying all of these components, as is readily appreciated by those skilled in the art. Preferred embodiments of the invention are described with respect to the drawings, wherein:

Figure 1 illustrates downstream processing;
Figure 2 shows restriction endonuclease maps of DNA fragments of sprA and sprB;
Figure 3 illustrates restriction endonuclease maps and sequencing strategies in sequencing DNA fragments containing sprA and sprB;
Figure 4 is the DNA sequence of sprA;
Figure 4a is the DNA sequence encoding the sprA (protease A) signal peptide;
Figure 4b is the DNA sequence encoding the sprA (protease A) propeptide;
Figure 4c is the DNA sequence encoding mature protease A;
Figure 5 is the DNA sequence of sprB;
Figure 5a is the DNA sequence encoding the sprB (protease B) signal peptide;
Figure 5b is the DNA sequence encoding the sprB (protease B) propeptide;
Figure 5c is the DNA sequence encoding mature protease B;
Figure 6 is an alignment of the amino acid sequences deduced from sprA and sprB to develop homology between the two sequences.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

The organism Streptomyces griseus is a well recognized microorganism. It is commercially used for the production of the enzyme Pronase. It is appreciated, however, that this organism also secretes two enzymes, protease A and protease B, which are both serine proteases. Although the structures of proteases A and B have been extensively studied, the genes encoding these proteins, and the manner in which this genetic information is used to signal secretion by the cells, are not understood. According to this invention, the genes which encode protease A and protease B and provide for the secretion of these proteins in bioactive form have been discovered. It has been determined that each of protease A and B is included in a precursor protein which is processed to remove an amino-terminal polypeptide portion from the mature protease. It has further been determined that each of the protease A and B precursor proteins is enzymatically processed to form correctly-positioned intramolecular disulphide bonds, which processing is concomitant with removal of the amino-terminal addressing peptide from the mature precursor. The discovered genes, which encode proteases A and B, their intermediate address-competent forms, and their control elements, have been designated sprA and sprB.

As discussed in the following articles, Jurasek, L., M.R. Carpenter, L.B. Smillie, A. Gertler, S. Levy, and L.H. Ericsson (1974), Amino acid sequencing of Streptomyces griseus protease B, a major component of pronase, Biochem. Biophys. Res. Comm. 61:1095-1100; Young, C.L., W.C. Barker, C.M. Tomaselli, and M.O. Dayhoff (1978), Serine proteases, In M.O. Dayhoff (ed.), Atlas of Protein Sequence and Structure 5, suppl. 3:73-93, proteases A and B are homologous proteins containing several segments of identical amino acid sequence. In accordance with this invention, the genes which encode and direct the secretion of each of proteases A and B have identical DNA sequences corresponding to the regions of identity between the homologous proteins. In order to isolate the genes, it was assumed that such identity would occur in portions of the gene sequences, so that an oligonucleotide probe could be designed from one of the similar regions in the sequences.

In order to extrapolate the gene sequence which would encode the similar amino acid sequence, the known codon bias for Streptomyces was relied upon to develop the nucleotide probe (see Bernan, V., D. Filpula, W. Herber, M. Bibb, and E. Katz (1985), The nucleotide sequence of the tyrosinase gene from Streptomyces antibioticus and characterization of the gene product, Gene 37:101-110; Bibb, M.J., M.J. Bibb, J.M. Ward, and S.N. Cohen (1985), Nucleotide sequences encoding and promoting expression of three antibiotic resistance genes indigenous to Streptomyces, Mol. Gen. Genet. 199:26-36; Thompson, C.J., and G.S. Gray (1983), Nucleotide sequence of a streptomycete aminoglycoside phosphotransferase gene and its relationship to phosphotransferases encoded by resistance plasmids, Proc. Natl. Acad. Sci. USA 80:5190-5194). Once the probe was constructed, it was then possible to probe the DNA sequences of S. griseus to determine if there were any corresponding nucleic acid sequences in the microorganism. Since it was known that there were two proteases, A and B, the oligonucleotide probe should have revealed two DNA fragments by hybridization analysis. In fact, not only did the probe hybridize equally to two fragments generated in the genomic library of S. griseus, but also two fragments generated by BamHI digest (8.4 kb and 6.8 kb) or BglII digest (11 kb and 2.8 kb) were isolated from the genomic library. As a cross-check with respect to the predictability of such a probe, the same fragments were detected in genomic DNA libraries of other isolates of S. griseus. It was noted, however, that there was no such hybridization of the oligonucleotide probe with DNA from other Streptomyces such as S. lividans.
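The probe-design idea in the preceding paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the codon choice below is a simple stand-in heuristic (prefer G or C in the third codon position, reflecting the GC-rich codon bias of Streptomyces), not the codon usage data the inventors relied upon, and the peptide `GDSGGP` is a hypothetical example rather than the region actually used for the probe.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def gc_rich_codon(aa):
    """Pick one codon for `aa`, preferring G/C in the third position.

    Assumed heuristic: a real design would use a measured codon usage table.
    Ties are broken by total GC count, then alphabetically, so the result
    is deterministic.
    """
    candidates = [c for c, a in CODON_TABLE.items() if a == aa]
    return max(candidates, key=lambda c: (c[2] in "GC", sum(b in "GC" for b in c), c))


def back_translate(peptide):
    """Return a single 'best guess' coding sequence for a peptide."""
    return "".join(gc_rich_codon(aa) for aa in peptide)


def translate(dna):
    """Translate an uppercase DNA string in frame 0, ignoring trailing bases."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - len(dna) % 3, 3))


# Hypothetical conserved peptide segment (illustration only).
probe = back_translate("GDSGGP")
```

The resulting `probe` string is one guessed coding sequence; translating it back recovers the peptide, and every codon ends in G or C.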

Plasmids were constructed containing digested fragments of S. griseus DNA. The oligonucleotide probe was used to isolate plasmids containing sprA and sprB. The screening by use of the probe was accomplished by colony blot hybridization, in which approximately 15,000 E. coli transformants containing the constructed plasmids were screened. Twelve transformants were detected by the probe and isolated for further characterization. These colonies contained two distinct classes of plasmid based on restriction analysis. As expected from the hybridization of genomic DNA, the plasmids contained either the 6.8 kb or the 8.4 kb BamHI fragment. These fragments contained the sprA and sprB genes.

The fragments as isolated by hybridization screening were tested for the expression of proteolytic activity. With these plasmids identified, such characterization may be accomplished in accordance with a variety of known techniques in accordance with a preferred embodiment of this invention.

The 6.8 kb and 8.4 kb BamHI fragments were ligated into the BglII site of the vector pIJ702. Transformants of S. lividans containing these constructions were tested on a milk plate for secretion of proteases. A clear zone, which represented the degradation of the milk proteins, surrounded each transformant that contained either BamHI fragment. It was noted that the clear zones were not found around S. lividans colonies which contained either pIJ702 only or no plasmid construct.

Proteolytic activity was also observed when the BamHI fragments were cloned in either orientation with respect to the vector, thereby minimizing the possibility of read-through transcription of an incomplete protease gene. This observation provides evidence that each of the two BamHI fragments contains an intact protease gene which is capable of effecting secretion in a different Streptomyces species, for example S. lividans. With this particularly relevant characterization of the BamHI fragments, and knowing that the desired genes were in these fragments, it was possible to isolate and to sequence the genes encoding protease A and protease B.

According to a preferred aspect of this invention, the particular protease gene contained within each cloned BamHI fragment was determined by dideoxy sequencing of the plasmids using the oligonucleotide probe as a primer in such analysis. The 8.4 kb BamHI fragment was found to contain sprB, because a polypeptide deduced from the DNA sequence matched a unique segment of the known amino acid sequence of protease B. The 6.8 kb BamHI fragment contained sprA by process of elimination. The protease genes in these fragments were localized by digesting the plasmids and determining which of the restriction fragments of the plasmids were capable of hybridizing to the oligonucleotide probe.
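The gene assignment described above rests on translating the sequenced DNA and looking for a stretch that matches a known segment of the protein. A minimal sketch of that comparison follows, assuming an uppercase DNA read and a search restricted to the three forward reading frames; the read and peptide in the usage note are hypothetical and are not taken from sprA or sprB.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an uppercase DNA string in frame 0, ignoring trailing bases."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - len(dna) % 3, 3))


def find_peptide(dna, peptide):
    """Search the three forward frames for `peptide`.

    Returns (frame, residue_index) for the first hit, or None if the
    peptide is not encoded in any forward frame.
    """
    for frame in range(3):
        position = translate(dna[frame:]).find(peptide)
        if position != -1:
            return frame, position
    return None
```

For example, `find_peptide("CGGCGACTCC", "GDS")` locates the peptide in the second forward frame, while a read that does not encode the peptide returns `None`. A fuller treatment would also scan the reverse complement.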

Figure 2 shows detailed restriction maps of the 6.8 kb and 8.4 kb BamHI fragments. Hybridization to the oligonucleotide probe was confined to a 0.9 kb PvuII-StuI fragment of sprA and a 0.6 kb PvuII-PvuI fragment of sprB. Such hybridization is indicated by the heavy lines in Figure 2. Hybridization to the cloned BamHI fragments and the 2.8 kb BglII fragment of sprB agrees with the hybridization to BamHI and BglII fragments of genomic DNA. Thus, rearrangement of the BamHI fragments containing the protease genes is unlikely to have occurred.

The functional portions of the sprA- and sprB-containing DNA were determined by subcloning restriction fragments thereof into pIJ702. The constructed plasmids were transformed into S. lividans and tested for proteolytic activity. The 3.2 kb BamHI-BglII fragment of sprA and the 2.8 kb BglII fragment of sprB, when subcloned into pIJ702 in either orientation, resulted in the secretion of protease from S. lividans. The intact protease genes were further delimited to a 1.9 kb StuI fragment for sprA and a 1.4 kb BssHII fragment for sprB. With reference to Figure 2, each of these functionally active subclones is indicated below the restriction maps, which contain the region for each gene which hybridized to the oligonucleotide probe.

In order to determine the nucleic acid sequence of the protease genes, the 3.2 kb BamHI-BglII fragment of sprA and the 2.8 kb BglII fragment of sprB were subcloned into pUC18 to facilitate further structural characterization. Figure 3 shows the restriction maps of these subclones and the strategies which were used to sequence the 1.4 kb SalI fragment containing sprA and the 1.4 kb BssHII fragment containing sprB. The resultant DNA sequences of sprA and sprB are shown in Figures 4 and 5, respectively. The predicted amino acid sequence of protease A differed from the published sequence by the amidation of amino acid 133, whereas that of protease B was identical to the published sequence (see Fujinaga, M., L.T.J. Delbaere, G.D. Brayer, and M.N.G. James (1985), Refined structure of α-lytic protease at 1.7 Å resolution: Analysis of hydrogen bonding and solvent structure, J. Mol. Biol. 183:479-502).

Analyzing the sequences of Figures 4 and 5, it is apparent that each sequence contains a large open reading frame with the coding region of the mature protease situated at the 3' end. For the protease A and protease B genes, the sequence encoding the carboxy-terminus of the protease is followed immediately by a translation stop codon. At the other end of the sequence, the predicted amino acid sequences appear to extend beyond the amino-termini of the mature proteases A and B by an additional 116 amino acids for sprA of Figure 4 and 114 amino acids for sprB of Figure 5. The putative GTG initiation codons at each of these positions (-116 for Figure 4; -114 for Figure 5) are each preceded by a potential ribosome binding site (as indicated by the series of five dots above the sequence) and followed by a sequence which encodes a signal peptide. The processing site for the signal peptidase (identified by the light arrow in Figures 4 and 5) is predicted at 38 amino acids from the amino-terminus of the putative precursor. [For clarity, that part of the nucleic acid sequences of Figures 4 and 5 corresponding to the signal peptide portion of sprA and sprB is reproduced in Figures 4A and 5A, respectively.] The propeptide is encoded by the remaining sequence between the signal processing site (light arrow) and the start of the mature protein (indicated at the dark arrow). [For clarity, that part of the nucleic acid sequences of Figures 4 and 5 corresponding to the propeptide portion of sprA and sprB is reproduced in Figures 4B and 5B, respectively.] The mature protease is encoded by the codon sequence 1 through 181 for Figure 4 and 1 through 185 for Figure 5. [For clarity, that part of the nucleic acid sequences of Figures 4 and 5 corresponding to the mature protein portion of sprA and sprB is reproduced in Figures 4C and 5C, respectively.] The amino acid sequence for codons -116 through +181 of Figure 4 and the amino acid sequence for codons -114 through +185 of Figure 5, when made in the living cell S. griseus, are acted upon in a manner to produce in the culture medium, externally of the living cells, the mature bioactive enzymes protease A and protease B. The processing directed by the information encoded in that portion of the gene from the start of the promoter to the start of the mature protein in each case included providing a secretory address, the correct signal peptide processing site, the necessary propeptide structure not only for secretion but also for correct disulphide bond formation concomitant with secretion, and competent secretion in bioactive form.
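The codon numbering quoted above fixes the size of each segment of the two precursors. The short sketch below works through that arithmetic, assuming the numbering runs from -1 directly to +1 with no codon 0, which is consistent with the stated 116 and 114 additional amino acids. The class and field names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Precursor:
    name: str
    first_codon: int        # number of the initiation codon, e.g. -116
    mature_length: int      # codons +1 through +mature_length
    signal_length: int = 38  # signal peptidase site, per the description

    def segments(self):
        """Return segment lengths in codons (assumes no codon numbered 0)."""
        leader = -self.first_codon  # residues preceding mature residue +1
        return {
            "signal": self.signal_length,
            "propeptide": leader - self.signal_length,
            "mature": self.mature_length,
            "total": leader + self.mature_length,
        }


spr_a = Precursor("sprA", first_codon=-116, mature_length=181)
spr_b = Precursor("sprB", first_codon=-114, mature_length=185)
```

Under these assumptions the propeptides come out at 78 codons for sprA and 76 codons for sprB, and the full precursors at 297 and 299 codons respectively.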

In accordance with this invention, the ability of the signal peptide to direct the secretion of bioactive protein was established by inserting known DNA sequences at the beginning and at the end of known sequences. For example, consider the sequence shown in Figure 5. In particular, the promoter and initiator ATG of the aminoglycoside phosphotransferase gene (Thompson, C.J., and G.S. Gray (1983), Nucleotide sequence of a streptomycete aminoglycoside phosphotransferase gene and its relationship to phosphotransferases encoded by resistance plasmids, Proc. Natl. Acad. Sci. USA, 80:5190-5194) were inserted preceding the second codon (AGG at -113) of the signal sequence of Figure 5. Due to the insertion of this new promoter and initiator, the sprB gene, now under the control of this non-native promoter, directed both elevated levels and earlier expression of proteolytic activity when compared with the unaltered sprB gene. The secretion of bioactive protease B in this construction indicated that nucleic acid sequences preceding the GTG initiation codon at -114 are not required for the correct secretion of protease B in bioactive form, provided an active and competent promoter is placed in the precise location indicated.

In order to demonstrate further the universality of the discovered signal peptide, the sprB coding region was replaced with a gene sequence encoding the mature amylase from S. griseus. Hence the nucleic acid sequence encoding the amylase was inserted in place of the sequence of Figure 5 to the right of the light arrow. It was determined that the resulting genetic construction directed the production of an extracellular protein having an N-terminal alanine, properly positioned intramolecular disulphide bonds, and exhibiting amylolytic activity at a level comparable to that of a similar construction with the natural signal peptide of amylase. In accordance with this invention, the 38 amino acid signal peptide of Figures 4 and 4A and 5 and 5A is sufficient to direct the secretion of non-native protein in bioactive form.

Since both signal sequences encode the signal peptides of Figures 4 and 4A and 5 and 5A, the organization of the coding regions of sprA and sprB was investigated by comparing the amino acid homology of the encoded peptide sequences. Such comparisons are set out in Figure 6, where amino acid homology has been compared for the signal peptide of Figure 6a, the propeptide of Figure 6b and the mature protease of Figure 6c. A summary of such homology is provided in the following Table I.

TABLE I

Homology of sprA and sprB Coding Regions

Region                 Length (codons)   Protein Homology %   DNA Homology %
Signal                        38                 50                 58
Propeptide                    79                 43                 62
NT protease (a)               87                 45                 58
CT protease (b)              103                 75                 75
Total protease               190                 61                 67
Total coding region          307                 55                 65

(a) amino-termini of mature proteases (amino acids 1-87)
(b) carboxy-termini of mature proteases (amino acids 88-190)



The alignment of amino acid sequences translated from the coding regions of the sprA and sprB genes indicates an overall homology of 54% on the basis of amino acid identity. As indicated in Table I, the sequence homology is not uniformly distributed throughout the coding region of the sprA and sprB genes. The carboxy-terminal domains of the proteases A and B are 75% homologous, as noted under the heading "CT protease", whereas the average homology for the remainder of the coding region is only 45%, indicated under the heading "NT protease". The amino-terminal domains containing the signal and propeptide regions were similar in both extent of homology and distribution of consensus sequences, as indicated under the headings "signal" and "propeptide". The unexpectedly high DNA sequence homology relative to that of the protein sequences is particularly due to the 61% conservation in the third position of each codon of the sequence. These investigations, revealing the close homology between the sprA and sprB genes, suggest that both genes originated by duplication of a common ancestral gene. With appropriate care and investigation, the commonality of the signal peptides can be determined, thus establishing the cue for secretion of proteins and hence providing sufficient information to construct, from the signal DNA of sprA and sprB, a single nucleic acid sequence which will be competent to direct protein secretion.
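As a cross-check on Table I, the "Total" rows can be recomputed as length-weighted averages of the segment rows. The sketch below assumes each percentage is an identity fraction over the stated number of aligned codons and that the published totals were rounded to whole percentages; the dictionary keys are illustrative names.

```python
# Table I segment rows: (length in codons, protein homology %, DNA homology %)
ROWS = {
    "signal": (38, 50, 58),
    "propeptide": (79, 43, 62),
    "nt_protease": (87, 45, 58),
    "ct_protease": (103, 75, 75),
}

PROTEIN, DNA = 1, 2  # column indices into each row tuple


def combined_length(names):
    """Total number of aligned codons across the named segments."""
    return sum(ROWS[n][0] for n in names)


def weighted_homology(names, column):
    """Length-weighted average homology (%) across the named segments."""
    weighted_sum = sum(ROWS[n][0] * ROWS[n][column] for n in names)
    return weighted_sum / combined_length(names)


protease = ["nt_protease", "ct_protease"]
coding_region = ["signal", "propeptide", "nt_protease", "ct_protease"]
```

With these inputs the weighted values round to 61% and 67% for the total protease and to 55% and 65% for the total coding region, matching the corresponding rows of Table I.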

In accordance with the invention, a recombinant DNA sequence can be developed which encodes a desired protein where the expressed protein, in conjunction with the signal peptide and optionally the propeptide, provides for secretion of the desired protein in bioactive form. The recombinant DNA sequence may be inserted in a suitable vector for transforming a desired cell for manufacturing the protein. Suitable expression vectors may include plasmids and viral phages. As is appreciated by those skilled in the art, the bioactivity of secretory proteins is assured by establishing the correct configuration of intramolecular disulphide bonds. Thus, suitable prokaryotic hosts may be selected for their ability to display enzymatic activity of a type typified by, but not limited to, that of protein disulphide oxidoreductase, EC 5.3.4.1.

The particular protein encoded by the recombinant DNA sequence may include eukaryotic secretory enzymes, such as prochymosin, chymotrypsin, trypsins, amylases, ligninases, chymosin, elastases, lipases, and cellulases; prokaryotic secretory enzymes, such as glucose isomerase, amylases, lipases, pectinases, cellulases, proteinases, oxidases, ligninases; blood factors, such as Factor VIII and Factor IX and Factor VIII-related biosynthetic blood coagulant proteins; tissue-type plasminogen activator; hormones, such as proinsulin; lymphokines, such as beta and gamma-interferon, and interleukin-2; enzyme inhibitors, such as extracellular proteins whose action is to destroy antibiotics either enzymatically or by binding, for example, a β-lactamase inhibitor, α-trypsin inhibitor; growth factors, such as organism or nerve growth factors, epidermal growth factors, tumor necrosis factors, colony stimulating factors; immunoglobulin-related molecules, such as synthetic, designed, or engineered antibody molecules; cell receptors, such as cholesterol receptor; viral molecules, such as viral hemagglutinins, AIDS antigen and immunogen, hepatitis B antigen and immunogen, foot-and-mouth disease virus antigen and immunogen; bacterial surface effectors, such as protein A; toxins, such as protein insecticides, algicides, fungicides, and biocides; and systemic proteins of medical importance, such as myocardial infarct protein (MIP), weight control factor (WCF), caloric rate protein (CRP) and hirudin (HRD).

One skilled in the art can easily determine whether the use of any known or unknown organism will be 
within the scope of this invention in accordance with the above discussion and the following examples. 

Microorganisms which may be useful in this respect as potential prokaryotic expression hosts include: Order Actinomycetales; Family Actinomycetaceae: Genus Matruchonema, Lactophera; Family Actinobacteria: Genus Actinomyces, Agromyces, Arachina, Arcanobacterium, Arthrobacter, Brevibacterium, Cellulomonas, Curtobacterium, Microbacterium, Oerskovia, Promicromonospora, Renibacterium, Rothia; Family Actinoplanetes: Genus Actinoplanes, Dactylosporangium, Micromonospora; Family Nocardioform actinomycetes: Genus Caseobacter, Corynebacterium, Mycobacterium, Nocardia, Rhodococcus; Family Streptomycetes: Genus Streptomyces, Streptoverticillium; Family Maduromycetes: Genus Actinomadura, Excellospora, Microspora, Planospora, Spirillospora, Streptosporangium; Family Thermospora: Genus Actinosynnema, Nocardiopsis, Thermophilla; Family Microspora: Genus Actionospora, Saccharospora; Family Thermoactinomycetes: Genus Thermoactinomyces; and the other prokaryotic genera: Acetivibrio, Acetobacter, Achromobacter, Acinetobacter, Aeromonas, Bacterionema, Bifidobacterium, Flavobacterium, Kurthia, Lactobacillus, Leuconostoc, Myxobacteria, Propionibacterium.

The following species from the genus Streptomyces are identified as particularly suitable as hosts: acidophillus, albus, amylolyticus, argentiolus, aureofaciens, aureus, candidus, cellostaticus, cellulolyticus, coelicolor, creamorus, diastaticus, farinosus, flaveolus, flavogriseus, fradiae, fulvoviridis, fungicidicus, gelaticus, glaucescens, globisporus, griseolus, griseus, hygroscopicus, ligninolyticus, lipolyticus, lividans, moderatus, olivochromogenus, parvus, phaeochromogenes, plicatus, proteolyticus, rectus, roseolus, roseoviolaceus, scabies, thermolyticus, tumorstaticus, venezuelae, violaceus, violaceus-ruber, violascens, and viridochromogenes.

acrimycini, alboniger, ambofaciens, antibioticus, aspergilloides, chartreusis, clavuligerus, diastatochromogenes, echinatus, erythraeus, fendae, griseofuscus, kanamyceticus, kasugaensis, koganeiensis, lavendulae, parvulus, peucetius, reticuli, rimosus, vinaceus.

Also, the following eukaryotic hosts are potentially useful in the practice of this invention: Absidia, Acremonium, Acrophialopora, Acrospeira, Alternaria, Arthrobotrys, Ascotricha, Aureobasidium, Beauveria, Bispora, Bjerkandera, Calocera, Candida, Cephaliophora, Cephalosporium, Cerinomyces, Chaetomium, Chrysosporium, Circinella, Cladosporium, Cliomastix, Coccospora, Cochliobolus, Cunninghamella, Curvularia, Custingophara, Dacrymyces, Dacryopinax, Dendryphion, Dictosporium, Doratomyces, Drechslera, Eupenicillium, Flammulina, Fusarium, Gliocladium, Gliomastix, Graphium, Hansenula, Humicola, Hyalodendron, Isaria, Kloeckera, Kluyveromyces, Lipomyces, Mammaria, Merulius, Microascus, Monodictys, Monosporium, Morchella, Mortierella, Mucor, Myceliophthora, Myrothecium, Neurospora, Oedocephalum, Oidiodendron, Pachysolen, Papularia, Papulaspora, Penicillium, Peniophora, Periconia, Phaeocoriolellus, Phanerochaete, Phialophora, Piptocephalis, Pleurotus, Preussia, Pycnoporus, Rhinocladiella, Rhizomucor, Rhizopus, Rhodotorula, Robillarda, Saccharomyces, Schwanniomyces, Scolecobasidium, Scopulariopsis, Scytalidium, Stachybotrys, Tetracladium, Thamnidium, Thermoascus, Thermomyces, Thielavia, Tolypocladium, Torula, Torulopsis, Trametes, Tricellula, Trichocladium, Trichoderma, Trichurus, Truncatella, Ulocladium, Ustilago, Verticillium, Wardomyces, Xylogone, Yarrowia.

Preferred embodiments of the invention are exemplified in the following procedures. Such procedures and results are by way of example and are not intended to limit in any way the scope of the claims.

PREPARATIONS 

Strains and Plasmids

Streptomyces griseus (ATCC 15395) was obtained from the American Type Culture Collection. Streptomyces lividans 66 (Bibb, M.J., J.L. Schottel, and S.N. Cohen (1980), A DNA cloning system for interspecies gene transfer in antibiotic-producing Streptomyces, Nature 284:526-531) and the plasmids pIJ61 and pIJ702 were obtained from the John Innes Institute (Thompson, C.J., T. Kieser, J.M. Ward, and D.A. Hopwood (1982), Physical analysis of antibiotic-resistance genes from Streptomyces and their use in vector construction, Gene 20:51-62; Katz, E., C.J. Thompson, and D.A. Hopwood (1983), Cloning and expression of the tyrosinase gene from Streptomyces antibioticus in Streptomyces lividans, J. Gen. Microbiol., 129:2703-2714). E. coli strain HB101 (ATCC 33694) was used for all transformations. Plasmids pUC8, pUC18 and pUC19 were purchased from Bethesda Research Laboratories.

Media, Growth and Transformation 

Growth of Streptomyces mycelium for the isolation of DNA or the preparation of protoplasts was as
described in Hopwood, D.A., M.J. Bibb, K.F. Chater, T. Kieser, C.J. Bruton, H.M. Kieser, D.J. Lydiate, C.P.
Smith, J.M. Ward, and H. Schrempf (1985), Genetic Manipulation of Streptomyces: A Laboratory Manual,
The John Innes Foundation, Norwich, UK. Protoplasts of S. lividans were prepared by lysozyme treatment,
transformed with plasmid DNA, and selected for resistance to thiostrepton, as described in Hopwood, D.A.,
M.J. Bibb, K.F. Chater, T. Kieser, C.J. Bruton, H.M. Kieser, D.J. Lydiate, C.P. Smith, J.M. Ward, and H.
Schrempf (1985), Genetic Manipulation of Streptomyces: A Laboratory Manual, The John Innes Foundation,
Norwich, UK. Transformants were screened for proteolytic or amylolytic activity on LB plates containing
30 µg/ml thiostrepton and either 1% skim milk or 1% corn starch, respectively. E. coli transformants were
grown on YT medium containing 50 µg/ml ampicillin.

Materials

Oligonucleotides were synthesized using an Applied Biosystems 380A DNA synthesizer. Columns,
phosphoramidites, and reagents used for oligonucleotide synthesis were obtained from Applied Biosystems,
Inc. through Technical Marketing Associates. Oligonucleotides were purified by polyacrylamide gel
electrophoresis followed by DEAE cellulose chromatography. Enzymes for digesting and modifying DNA were
purchased from New England Biolabs and used according to the supplier's recommendations.
Radioisotopes [α-32P]dATP (~3000 Ci/mmol) and [γ-32P]ATP (~3000 Ci/mmol) were from Amersham.
Thiostrepton was donated by Squibb.

EXAMPLE 1 - Isolation of DNA

Chromosomal DNA was isolated from Streptomyces as described in Chater, K.F., D.A. Hopwood, T.
Kieser, and C.J. Thompson (1982), Gene cloning in Streptomyces, Curr. Topics Microbiol. Immunol. 96:69-95,
except that sodium dodecyl sarcosinate (final conc. 0.5%) was substituted for sodium dodecyl sulfate.
Plasmid DNA of transformed S. lividans was prepared by an alkaline lysis procedure as set out in Hopwood,
D.A., M.J. Bibb, K.F. Chater, T. Kieser, C.J. Bruton, H.M. Kieser, D.J. Lydiate, C.P. Smith, J.M. Ward, and H.
Schrempf (1985), Genetic Manipulation of Streptomyces: A Laboratory Manual, The John Innes Foundation,
Norwich, UK. Plasmid DNA from E. coli was purified by a rapid boiling method (Holmes, D.S., and M.
Quigley (1981), A rapid boiling method for the preparation of bacterial plasmids, Anal. Biochem. 114:193-197).
DNA fragments and vectors used for all constructions were separated by electrophoresis on low
melting point agarose, and purified from the molten agarose by phenol extraction.

EXAMPLE 2 - Construction of Genomic Library

Chromosomal DNA of S. griseus ATCC 15395 was digested to completion with BamHI and fractionated
by electrophoresis on a 0.8% low melting point agarose gel. DNA fragments ranging in size from 4 to 12
kilobase pairs (kb) were isolated from the agarose gel. The plasmid vectors pUC18 and pUC19 were
digested with BamHI and treated with calf intestinal alkaline phosphatase (Boehringer Mannheim). The S.
griseus BamHI fragments (0.3 µg) and vectors (0.8 µg) were ligated in a final volume of 20 µl as described
in Maniatis, T., E.F. Fritsch, and J. Sambrook (1982), Molecular Cloning: A Laboratory Manual, Cold Spring
Harbor Laboratory, Cold Spring Harbor, NY. Approximately 8000 transformants of HB101 were obtained
from each ligation reaction.
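
The DNA masses quoted for the ligation can be converted to an approximate molar ratio of vector to insert. The following Python sketch is illustrative only: the average mass of 650 g/mol per base pair is a common approximation, the 8 kb insert length is simply the midpoint of the 4 to 12 kb range and is an assumption, and the function name is hypothetical. The 2686 bp length is the published size of pUC18 and pUC19.

```python
# Rough molar amounts for the ligation described above (a sketch under stated assumptions).

AVG_BP_MASS = 650.0  # g/mol per base pair of double-stranded DNA (approximation, assumed)


def pmol_of_dsdna(mass_ug: float, length_bp: int) -> float:
    """Convert a mass of double-stranded DNA in micrograms to picomoles of molecules."""
    return mass_ug * 1e6 / (length_bp * AVG_BP_MASS)


VECTOR_BP = 2686   # pUC18 / pUC19
INSERT_BP = 8000   # assumed midpoint of the 4-12 kb size-selected fragments

vector_pmol = pmol_of_dsdna(0.8, VECTOR_BP)
insert_pmol = pmol_of_dsdna(0.3, INSERT_BP)
ratio = vector_pmol / insert_pmol

print(f"vector: {vector_pmol:.3f} pmol, insert: {insert_pmol:.3f} pmol")
print(f"vector:insert molar ratio is about {ratio:.1f}:1")
```

Under these assumptions the phosphatase-treated vector is in several-fold molar excess over the genomic fragments; shorter or longer average inserts shift the ratio proportionally.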

EXAMPLE 3 - Subcloning of Protease Gene Fragments 

A hybrid Streptomyces-E. coli vector was constructed by ligating pIJ702, which had been linearized by
BamHI, into the BamHI site of pUC8. The unique BglII site of this vector was used for subcloning BamHI
and BglII fragments of the protease genes. Other fragments were adapted with BamHI linkers to facilitate
ligation into the BglII site. The hybrid vector, with pUC8 inserted at the BamHI site of pIJ702, was incapable
of replicating in Streptomyces. However, the E. coli plasmid could be readily removed prior to transforming S.
lividans by digestion with BamHI followed by recircularization with T4 ligase.
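
Subcloning BamHI fragments into a BglII site works because both enzymes leave the same four-base cohesive end. The Python sketch below illustrates this using the standard recognition sequences of BamHI (G^GATCC), BglII (A^GATCT) and Sau3AI (^GATC); the function names are hypothetical and the code is not part of the original procedure.

```python
# Cohesive-end compatibility of the enzymes used for subcloning (illustrative sketch).

# Recognition site (top strand, 5'->3') and the offset of the cut within it.
ENZYMES = {
    "BamHI": ("GGATCC", 1),   # G^GATCC
    "BglII": ("AGATCT", 1),   # A^GATCT
    "Sau3AI": ("GATC", 0),    # ^GATC
}


def overhang(name: str) -> str:
    """Return the 5' single-stranded overhang left by a palindromic site."""
    site, cut = ENZYMES[name]
    return site[cut:len(site) - cut]


def compatible(a: str, b: str) -> bool:
    """Two ends can anneal if their overhangs are identical."""
    return overhang(a) == overhang(b)


def hybrid_site(left: str, right: str) -> str:
    """Sequence formed when an end cut by `left` is ligated to an end cut by `right`."""
    site_l, cut_l = ENZYMES[left]
    site_r, cut_r = ENZYMES[right]
    return site_l[:cut_l] + overhang(left) + site_r[len(site_r) - cut_r:]


for a, b in [("BamHI", "BglII"), ("BamHI", "Sau3AI"), ("BglII", "Sau3AI")]:
    print(a, b, compatible(a, b), hybrid_site(a, b))
```

A BamHI end joined to a BglII end gives GGATCT, which neither enzyme recuts. The BamHI-BamHI junctions flanking pUC8 in the hybrid vector remain cleavable, which is consistent with removing the E. coli plasmid by BamHI digestion while leaving inserts at the BglII site in place.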


EXAMPLE 4 - Construction for Testing the sprB Signal Peptide 

The 0.4 kb Sau3AI-NcoI fragment containing the aminoglycoside phosphotransferase gene promoter
was isolated from pIJ61 and subcloned into the BamHI and NcoI sites of a suitable vector. The NcoI site
containing the initiator ATG was joined to the MluI site of the sprB signal using two 43-mer oligonucleotides,
which reconstructed the amino-terminus of the signal peptide. An amylase gene of S. griseus was adapted
by ligating a 14-mer PstI linker to a SmaI site in the third codon. This removed the signal peptide and
restored the amino-terminus of the mature amylase. The HaeII site of the sprB signal was joined to the PstI
site of the amylase subclone using two 26-mer oligonucleotides, which reconstructed the carboxy-terminus
of the signal peptide.

EXAMPLE 5 - Hybridization 

A 20-mer oligonucleotide (5'TTCCC(C/G)AACAACGACTACGG3') was designed from an amino acid
sequence (FPNNDYG) which was common to both proteases. For use as a hybridization probe, the
oligonucleotide was end-labelled using T4 polynucleotide kinase (New England Biolabs) and [γ-32P]ATP.
Digested genomic or plasmid DNA was transferred to a Hybond-N nylon membrane (Amersham) by
electroblotting and hybridized in the presence of formamide (50%) as described in Hopwood, D.A., M.J.
Bibb, K.F. Chater, T. Kieser, C.J. Bruton, H.M. Kieser, D.J. Lydiate, C.P. Smith, J.M. Ward, and H. Schrempf
(1985), Genetic Manipulation of Streptomyces: A Laboratory Manual, The John Innes Foundation, Norwich,
UK. The filters were hybridized with the labelled oligonucleotide probe at 30 °C for 18 h, and washed at
47 °C. The S. griseus genomic library was screened by colony hybridization as described in Wallace, R.B.,
M.J. Johnson, T. Hirose, T. Miyake, E.H. Kawashima, and K. Itakura (1981), The use of synthetic
oligonucleotides as hybridization probes. II. Hybridization of oligonucleotides of mixed sequence to rabbit
globin DNA, Nucl. Acids Res. 9:879-894.
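
The relationship between the mixed-sequence probe and the peptide FPNNDYG can be checked by translating both members of the probe pool. The Python sketch below does this with the standard genetic code; the helper name `translate` and the variable names are hypothetical, and the check is an illustration rather than part of the original method.

```python
# Check that both variants of the 20-mer probe encode FPNNDYG (illustrative sketch).
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order at each position.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate complete codons only; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# The 20-mer from the text, with the single mixed position (C/G) expanded.
PROBE_VARIANTS = ["TTCCC" + base + "AACAACGACTACGG" for base in "CG"]

for probe in PROBE_VARIANTS:
    print(probe, translate(probe))

# The final two bases, GG, start a glycine codon whatever the third base is.
trailing = {CODON_TABLE["GG" + base] for base in BASES}
print(trailing)
```

Both variants translate to FPNNDY over the first six codons, and the trailing GG can only complete a glycine codon, so the 20-mer covers the heptapeptide with a single mixed position in the proline codon.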

EXAMPLE 6 - DNA Sequencing 

The sequences of sprA and sprB were determined using a combination of the chemical cleavage
sequencing method (Maxam, A., and W. Gilbert (1977), A new method for sequencing DNA, Proc. Natl.
Acad. Sci. U.S.A. 74:560-564) and the dideoxy sequencing method (Sanger, F., S. Nicklen, and A.R.
Coulson (1977), DNA sequencing with chain terminating inhibitors, Proc. Natl. Acad. Sci. U.S.A.
74:5463-5467). Restriction fragments were end-labeled using either polynucleotide kinase or the large
fragment of DNA Polymerase I (Amersham), with the appropriate radiolabeled nucleoside triphosphate.
Labeled fragments were either digested with a second restriction endonuclease or strand-separated,
followed by electroelution from a polyacrylamide gel. Subclones were prepared in the M13 bacteriophage
and the dideoxy sequencing reactions were run using the -20 universal primer (New England Biolabs). In
some areas of strong secondary structure, compressions and polymerase failure necessitated the use of
either inosine (Mills, D.R., and F.R. Kramer (1979), Structure independent nucleotide sequence analysis,
Proc. Natl. Acad. Sci. U.S.A. 76:2232-2235) or 7-deazaguanosine (Mizusawa, S., S. Nishimura, and F. Seela
(1986), Improvement of the dideoxy chain termination method of DNA sequencing by use of deoxy-7-deazaguanosine
triphosphate in place of dGTP, Nucleic Acids Res. 14:1319-1324) analogs in the dideoxy
reactions to clarify the sequence. The sequences were compiled using the software of DNASTAR
(Doggette, P.E., and F.R. Blattner (1986), Personal access of sequence databases on personal computers,
Nucleic Acids Res. 14:611-619).
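
Once a sequence has been compiled, the open reading frame can be translated to confirm the deduced peptide. The Python sketch below translates the first 42 nucleotides of the sprA coding region as read from Fig. 4, which begins with GTG. Treating GTG as an initiator that is read as methionine is an assumption stated here explicitly, and the function name `translate_orf` is hypothetical.

```python
# Translate the start of the sprA open reading frame (illustrative sketch).
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order at each position.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Codons accepted here as initiators and read as Met (assumption for this sketch).
START_CODONS = {"ATG", "GTG", "TTG"}


def translate_orf(dna: str) -> str:
    """Translate from the first base, reading an initiator codon as M and stopping at '*'."""
    peptide = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[i:i + 3]
        if i == 0 and codon in START_CODONS:
            peptide.append("M")
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


# First 14 codons of sprA, starting at the GTG initiator shown in Fig. 4.
SPRA_START = "GTGACCTTCAAGCGCTTCTCGCCGCTCAGCAGCACGTCAAGA"
print(translate_orf(SPRA_START))
```

The result matches the first line of the deduced amino acid sequence printed above the DNA in Fig. 4.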

Claims 

Claims for the following Contracting States: AT, BE, CH, DE, FR, GB, GR, IT, LI, LU, NL, SE

1. The DNA signal sequence of Fig. 4A.

2. The DNA signal sequence of Fig. 5A.

3. A vector comprising the signal sequence defined in claim 1 or claim 2 and also a sequence encoding a
desired protein fused thereto.

4. The vector of claim 3, which is a plasmid or phage.

5. A transformed prokaryotic cell comprising the vector of claim 3 or claim 4, which is capable of
expressing said sequences as a fusion protein.

6. The cell of claim 5, which is of the genus Streptomyces.

7. The cell of claim 6, which is S. lividans or S. griseus.

8. A method for preparing a desired protein, which comprises culturing the cell of any of claims 5 to 7 in a
nutrient medium, the fusion protein being produced as an intermediate and the signal sequence
directing secretion of the desired protein from the cell.

Claims for the following Contracting State : ES 

1. A process for preparing a DNA vector, comprising introducing the signal sequence of Fig. 4A or Fig. 5A
and also a sequence encoding a desired protein fused thereto.

2. The process of claim 1, wherein the vector is a plasmid or phage.

3. A process for preparing a transformed prokaryotic cell, comprising transformation with the vector of
claim 1 or claim 2, whereby the cell is capable of expressing said sequences as a fusion protein.

4. The process of claim 3, wherein the cell is of the genus Streptomyces.

5. The process of claim 4, wherein the cell is S. lividans or S. griseus.

6. A method for preparing a desired protein, which comprises culturing the cell of any of claims 3 to 5 in a
nutrient medium, the fusion protein being produced as an intermediate and the signal sequence
directing secretion of the desired protein from the cell.

Patentansprüche [Claims, German text]

Claims for the following Contracting States: AT, BE, CH, DE, FR, GB, GR, IT, LI, LU, NL, SE

1. DNA signal sequence of Fig. 4A.

2. DNA signal sequence of Fig. 5A.

3. Vector comprising the signal sequence defined in claim 1 or claim 2 and, linked thereto, a
sequence coding for a desired protein.

4. Vector according to claim 3, which is a plasmid or phage.

5. Transformed prokaryotic cell comprising the vector according to claim 3 or claim 4, which is
capable of expressing said sequences as a fusion protein.

6. Cell according to claim 5, of the genus Streptomyces.

7. Cell according to claim 6, which is S. lividans or S. griseus.

8. Method for producing a desired protein, which comprises culturing the cell according to any of
claims 5 to 7 in a nutrient medium, the fusion protein being produced as an intermediate and the
signal sequence directing secretion of the desired protein from the cell.

Claims for the following Contracting State: ES

1. Process for producing a DNA vector, which comprises introducing the signal sequence of Fig. 4A
or Fig. 5A and, linked thereto, a sequence coding for a desired protein.

2. Process according to claim 1, wherein the vector is a plasmid or phage.

3. Process for producing a transformed prokaryotic cell, comprising transformation
with the vector according to claim 1 or claim 2, whereby the cell is capable of expressing said
sequences as a fusion protein.

4. Process according to claim 3, wherein the cell is of the genus Streptomyces.

5. Process according to claim 4, wherein the cell is S. lividans or S. griseus.

6. Method for producing a desired protein, which comprises culturing the cell according to any of
claims 3 to 5 in a nutrient medium, the fusion protein being produced as an intermediate and the
signal sequence directing secretion of the desired protein from the cell.

Revendications [Claims, French text]

Claims for the following Contracting States: AT, BE, CH, DE, FR, GB, GR, IT, LI, LU, NL, SE

1. Signal DNA sequence of Figure 4A.

2. Signal DNA sequence of Figure 5A.

3. Vector comprising the signal sequence defined in claim 1 or claim 2 and, fused thereto,
a sequence coding for a desired protein.

4. Vector according to claim 3, which is a plasmid or a phage.

5. Transformed prokaryotic cell comprising the vector according to claim 3 or claim 4,
capable of expressing said sequences in the form of a fusion protein.

6. Cell according to claim 5, of the genus Streptomyces.

7. Cell according to claim 6, which is S. lividans or S. griseus.

8. Method for preparing a desired protein, which comprises culturing the cell according to
any one of claims 5 to 7 in a nutrient medium, the fusion protein being produced
as an intermediate and the signal sequence directing secretion of the desired protein out of
the cell.






EP 0 300 466 B1 



Claims for the following Contracting State: ES

1. Process for preparing a DNA vector, comprising introducing the signal sequence of Figure
4A or Figure 5A and, fused thereto, a sequence coding for a desired
protein.

2. Process according to claim 1, wherein the vector is a plasmid or a phage.

3. Process for preparing a transformed prokaryotic cell, comprising transformation with the
vector according to claim 1 or claim 2, whereby the cell is capable of expressing
said sequences in the form of a fusion protein.

4. Process according to claim 3, wherein the cell is of the genus Streptomyces.

5. Process according to claim 4, wherein the cell is S. lividans or S. griseus.

6. Method for preparing a desired protein, which comprises culturing the cell according to
any one of claims 3 to 5 in a nutrient medium, the fusion protein being produced
as an intermediate and the signal sequence directing secretion of the desired protein out of
the cell.
A

GTCGACCCCCATCTCATTCCGGGCTCGCGGGCGCGAATCCGGCCTTGCGTCAGGGACGGTCCCCGTCAACGATTC

CAGCGTGCAACTTGGCAGGTTCACGCCCACTCCCACTGGGTGAGAACCTCGCGCACCAACGGCCCCACCTCACCC

-116

MTFKRFSPLSSTSR

GACCGGGCCGTCCCCCCATACCTCGGAGGATCTCGTGACCTTCAAGCGCTTCTCGCCGCTCAGCAGCACGTCAAG
-100 -80

YARLLAVASGLVAAAALATPSAVAA
ATATGCACGGCTCCTCGCCGTGGCCTCCGGCCTGGTGGCCGCCGCGGCCCTGGCCACCCCCTCGGCCGTCGCCGC

-60

PEAESKATVSQLADASSAILAADVA
TCCCGAGGCGGAGTCCAAGGCCACCGTTTCGCAGCTCGCCGACGCCAGCTCCGCCATCCTCGCCGCTGATGTGGC

-40

GTAWYTEASTGKIVLTADSTVSKAE
GGGCACCGCCTGGTACACGGAGGCGAGCACGGGCAAGATCGTCCTCACCGCCGACAGCACCGTGTCGAAGGCCGA
-20

LAKVSNALAGSKAKLTVKRAEGKFT
ACTGGCCAAGGTCAGCAACGCGCTGGCGGGCTCCAAGGCGAAACTGACGGTCAAGCGCGCCGAGGGCAAGTTCAC

20

PLIAGGEAITTGGSRCSLGFNVSVN
CCCGCTGATCGCGGGCGGCGAGGCCATCACCACCGGTGGCAGCCGCTGTTCGCTCGGCTTCAACGTGTCGGTCAA

40

GVAHALTAGHCTNISASWSIGTRTG
CGGCGTCGCCCACGCGCTCACCGCCGGCCACTGCACCAACATCAGCGCCAGCTGGTCCATCGGCACGCGCACCGG

60

TSFPNNDYGIIRHSNPAAADGRVYL
AACCAGCTTCCCGAACAACGACTACGGCATCATCCGCCACTCGAACCCGGCGGCGGCCGACGGCCGGGTCTACCT
80

YNGSYQDITTAGNAFVGQAVQRSGS
GTACAACGGCTCCTACCAGGACATCACGACGGCGGGCAACGCCTTTGTGGGGCAGGCCGTCCAGCGCAGCGGCAG
100 120
TTGLRSGSVTGLNATVNYGSSGIVY
CACCACCGGGCTGCGCAGCGGCTCGGTCACCGGCCTCAACGCCACGGTCAACTACGGTTCCAGCGGGATCGTGTA

140

GMIQTNVCAEPGDSGGSLFAGSTAL
CGGCATGATCCAGACCAACGTCTGTGCCGAGCCCGGTGACAGTGGAGGCTCGCTCTTCGCGGGCAGCACCGCTCT

160

GLTSGGSGNCRTGGTTFYQPVTEAL
GGGTCTCACCTCCGGCGGCAGTGGCAACTGCCGGACCGGCGGCACCACGTTCTACCAGCCCGTCACCGAGGCGCT

SAYGATVL* 181

GAGCGCCTACGGGGCAACGGTCCTGTAGCCGGTGCCACCGGGGCTTCGGGCTGACCGCCGACCGGCCGCCCGAAG

CCCCGCGCGACGCCCCACCCCGGCGGACCGTGCTCGCGCGCGGTCCGCCCTCGCCGTGCCACGAACCCCACCGTC

CTTTCCCCGTCAGGCGCCTGCCGCTCGACCCGCATCGCGAAGTTGCCGAGAGTGGCCGGCTCGCACCGGCACTGC
TGAAGTCCTGCCCTCGCCCCACGGTCCGGTTCGCGCCCGCCCGGACGCGGACCCGCGCCTGGGGAAGCCCTCACT
CAACCCCGTTGCGCGCGGATGAGGTCGCGATACCAGGCGAAGGAGGCCTTCGGGGTGCGGACCTGTGTCTCGTGG

TCGAC
FIG.4B.

PEAESKATVSQLADASSAILAADVA
TCCCGAGGCGGAGTCCAAGGCCACCGTTTCGCAGCTCGCCGACGCCAGCTCCGCCATCCTCGCCGCTGATGTGGC

-40

GTAWYTEASTGKIVLTADSTVSKAE
GGGCACCGCCTGGTACACGGAGGCGAGCACGGGCAAGATCGTCCTCACCGCCGACAGCACCGTGTCGAAGGCCGA
-20

LAKVSNALAGSKAKLTVKRAEGKFT
ACTGGCCAAGGTCAGCAACGCGCTGGCGGGCTCCAAGGCGAAACTGACGGTCAAGCGCGCCGAGGGCAAGTTCAC



FIG.4C.

IAGGEAITTGGSRCSLGFNVSVN
ATCGCGGGCGGCGAGGCCATCACCACCGGTGGCAGCCGCTGTTCGCTCGGCTTCAACGTGTCGGTCAA

40

GVAHALTAGHCTNISASWSIGTRTG
CGGCGTCGCCCACGCGCTCACCGCCGGCCACTGCACCAACATCAGCGCCAGCTGGTCCATCGGCACGCGCACCGG

60

TSFPNNDYGIIRHSNPAAADGRVYL
AACCAGCTTCCCGAACAACGACTACGGCATCATCCGCCACTCGAACCCGGCGGCGGCCGACGGCCGGGTCTACCT
80

YNGSYQDITTAGNAFVGQAVQRSGS
GTACAACGGCTCCTACCAGGACATCACGACGGCGGGCAACGCCTTTGTGGGGCAGGCCGTCCAGCGCAGCGGCAG
100 120
TTGLRSGSVTGLNATVNYGSSGIVY
CACCACCGGGCTGCGCAGCGGCTCGGTCACCGGCCTCAACGCCACGGTCAACTACGGTTCCAGCGGGATCGTGTA

140

GMIQTNVCAEPGDSGGSLFAGSTAL
CGGCATGATCCAGACCAACGTCTGTGCCGAGCCCGGTGACAGTGGAGGCTCGCTCTTCGCGGGCAGCACCGCTCT

160

GLTSGGSGNCRTGGTTFYQPVTEAL
GGGTCTCACCTCCGGCGGCAGTGGCAACTGCCGGACCGGCGGCACCACGTTCTACCAGCCCGTCACCGAGGCGCT
181 
B

CGCTGTGCCCGCCGTGCGCCTTCGCCGATCACTTCATCTGCCCGTTCCCGCCCCCGGGCAACACGCTCGCCGCGG

CGGTTTTGGCGGGGGAGCGGAACCGGATCGACGCCTGACCCGCGCGAGGCCCCACCGGCCCCGGCAGCCGCACGG

CTCCCGGGGCCGGTGACGGATGTGACCCGCGTGGCCGAAAGGCATTCTTGCGTCCCCCGTCCGGCCCCCTCGATA

CTCCGGTCAGCGATTGTCAGGGGCACGGCGAATTCGAAATCCGGACAGGCCCCCGACTGCGCCTCACGGGCCCGC

MRIKRTSNRSN

CACCCCACAGGAGGGCCCCCGATTCCCCTCGGAGGAACCCGAAGTGAGGATCAAGCGCACCAGCAACCGCTCGAA
-100 -80
AARRVRTTAVLAGLAAVAALAVPTA
CGCGGCGAGACGCGTCCGCACCACCGCCGTACTCGCGGGGCTCGCCGCCGTCGCGGCGCTGGCCGTTCCCACCGC
-60
NAETPRTFSANQLTAASDAVLGADI
GAACGCCGAAACCCCCCGGACGTTCAGTGCCAACCAGCTGACCGCGGCGAGCGACGCCGTGCTCGGCGCCGACAT

-40

AGTAWNIDPQSKRLVVTVDSTVSKA
CGCGGGCACCGCCTGGAACATCGACCCGCAGTCCAAGCGCCTCGTCGTCACCGTCGACAGCACGGTCTCGAAGGC

EINQIKKSAGANADALRIERTPGKF
GGAGATCAACCAGATCAAGAAGTCGGCGGGCGCCAACGCCGACGCGCTGCGGATCGAGCGCACCCCCGGGAAGTT

TKLISGGDAIYSSTGRCSLGFNVRS
CACCAAGCTGATCTCCGGCGGCGACGCGATCTACTCCAGCACCGGACGCTGCTCGCTCGGCTTCAACGTCCGCAG

GSTYYFLTAGHCTDGATTWWANSAR
CGGCAGCACCTACTACTTCCTGACCGCCGGCCACTGCACGGACGGCGCGACCACCTGGTGGGCGAACTCGGCCCG

60

TTVLGTTSGSSFPNNDYGIVRYTNT
CACCACGGTGCTCGGCACGACCTCCGGGTCGAGCTTCCCGAACAACGACTACGGCATCGTGCGCTACACCAACAC
90

TIPKDGTVGGQDITSAANATVGMAV
CACCATTCCCAAGGACGGCACGGTCGGCGGCCAGGACATCACCAGCGCCGCCAACGCCACCGTCGGCATGGCGGT
100 120

TRRGSTTGTHSGSVTALNATVNYGG
CACCCGCCGCGGCTCCACCACCGGCACCCACAGCGGTTCGGTCACCGCACTCAACGCCACCGTCAACTACGGGGG

140

GDVVYGMIRTNVCAEPGDSGGPLYS
CGGCGACGTCGTCTACGGCATGATCCGCACCAACGTGTGCGCGGAGCCCGGCGACTCCGGCGGCCCGCTCTACTC

GTRAIGLTSGGSGNCSSGGTTFFQP
CGGCACCCGGGCGATCGGTCTGACCTCCGGCGGCAGCGGCAACTGCTCCTCCGGCGGCACGACCTTCTTCCAGCC

185

VTEALSAYGVSVY*
GGTCACCGAGGCGCTGAGCGCGTACGGCGTCAGCGTGTACTGACCGGCCCCGCCCCGGTCGGGTACGGAGCAGTC

CGTACAAACGTGCCCCCGTCCGGAATTCCGGACGGGGGCTCCCGCTCGCCGGGGAGCTCTTGAGAGGATGTCGCC
ACGACGGGTCGCCGCTGCGCGTC

FIG.5. 
FIG.5B.

AGTAWNIDPQSKRLVVTVDSTVSKA
CGCGGGCACCGCCTGGAACATCGACCCGCAGTCCAAGCGCCTCGTCGTCACCGTCGACAGCACGGTCTCGAAGGC

-20

EINQIKKSAGANADALRIERTPGKF
GGAGATCAACCAGATCAAGAAGTCGGCGGGCGCCAACGCCGACGCGCTGCGGATCGAGCGCACCCCCGGGAAGTT




99 
FIG.5C.

GSTYYFLTAGHCTDGATTWWANSAR
CGGCAGCACCTACTACTTCCTGACCGCCGGCCACTGCACGGACGGCGCGACCACCTGGTGGGCGAACTCGGCCCG

60

TTVLGTTSGSSFPNNDYGIVRYTNT
CACCACGGTGCTCGGCACGACCTCCGGGTCGAGCTTCCCGAACAACGACTACGGCATCGTGCGCTACACCAACAC

TIPKDGTVGGQDITSAANATVGMAV
CACCATTCCCAAGGACGGCACGGTCGGCGGCCAGGACATCACCAGCGCCGCCAACGCCACCGTCGGCATGGCGGT
100 120

TRRGSTTGTHSGSVTALNATVNYGG
CACCCGCCGCGGCTCCACCACCGGCACCCACAGCGGTTCGGTCACCGCACTCAACGCCACCGTCAACTACGGGGG

140

GDVVYGMIRTNVCAEPGDSGGPLYS
CGGCGACGTCGTCTACGGCATGATCCGCACCAACGTGTGCGCGGAGCCCGGCGACTCCGGCGGCCCGCTCTACTC

GTRAIGLTSGGSGNCSSGGTTFFQP
CGGCACCCGGGCGATCGGTCTGACCTCCGGCGGCAGCGGCAACTGCTCCTCCGGCGGCACGACCTTCTTCCAGCC
FIG.6A.

A

sprA MTFKRFSPLSSTSRYARLLAVASGLVAAAALATPSAVA
     M  KR S  S   R  R  AV  GL A AALA P A A
sprB MRIKRTSNRSNAARRVRTTAVLAGLAAVAALAVPTANA

FIG.6B.

B

sprA APEAESKATVSQLADASSAILAADVAGTAWYTEASTGKI
            A   QL  AS A L AD AGTAW
sprB ETPRTFSAN--QLTAASDAVLGADIAGTAWNIDPQSKRL

sprA VLTADSTVSKAELAKVSNALAGSKAK-LTVKRAEGKFTPL
     V T DSTVSKAE        AG  A  L   R  GKFT L
sprB VVTVDSTVSKAEINQIKKS-AGANADALRIERTPGKFTKL

FIG.6C.

C

sprA IAGGEAITTGGSRCSLGFNVSVNGVAHALTAGHCTNIS
     I GG AI     RCSLGFNV        LTAGHCT
sprB ISGGDAIYSSTGRCSLGFNVRSGSTYYFLTAGHCTDGA

sprA ASWS--------IGTRTGTSFPNNDYGIIRHSNPAAA-
       W          GT  G SFPNNDYGI R  N
sprB TTWWANSARTTVLGTTSGSSFPNNDYGIVRYTNTTIPK

sprA DGRVYLYNGSYQDITTAGNAFVGQAVQRSGSTTGLRSG
     DG V    G  QDIT A NA VG AV R GSTTG  SG
sprB DGTV----GG-QDITSAANATVGMAVTRRGSTTGTHSG

sprA SVTGLNATVNYGSSGIVYGMIQTNVCAEPGDSGGSLFA
     SVT LNATVNYG    VYGMI TNVCAEPGDSGG L
sprB SVTALNATVNYGGGDVVYGMIRTNVCAEPGDSGGPLYS

sprA GSTALGLTSGGSGNCRTGGTTFYQPVTEALSAYGATVL
     G  A GLTSGGSGNC  GGTTF QPVTEALSAYG  V
sprB GTRAIGLTSGGSGNCSSGGTTFFQPVTEALSAYGVSVY
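
The middle line of each alignment in Fig. 6 marks positions where sprA and sprB carry the same residue. For the two signal peptides of Fig. 6A, which align without gaps, that line and the resulting identity can be recomputed directly. The Python sketch below is illustrative; the function names are hypothetical and the sequences are those shown in Fig. 6A.

```python
# Recompute the identity line for the ungapped signal peptide alignment of Fig. 6A
# (illustrative sketch).

SPRA_SIGNAL = "MTFKRFSPLSSTSRYARLLAVASGLVAAAALATPSAVA"
SPRB_SIGNAL = "MRIKRTSNRSNAARRVRTTAVLAGLAAVAALAVPTANA"


def identity_line(a: str, b: str) -> str:
    """Return a string showing the shared residue at each column, or a space."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return "".join(x if x == y else " " for x, y in zip(a, b))


def count_identities(a: str, b: str) -> int:
    """Count columns at which both sequences carry the same residue."""
    return sum(x == y for x, y in zip(a, b))


line = identity_line(SPRA_SIGNAL, SPRB_SIGNAL)
matches = count_identities(SPRA_SIGNAL, SPRB_SIGNAL)

print("sprA", SPRA_SIGNAL)
print("    ", line)
print("sprB", SPRB_SIGNAL)
print(f"{matches} of {len(SPRA_SIGNAL)} positions identical")
```

The same column-by-column comparison applies to the gapped alignments of Figs. 6B and 6C, provided gap characters are excluded from the match count.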



94 



